 storage solution


Palantir vs Snowflake: Data warehousing tools comparison

#artificialintelligence

Palantir and Snowflake are data warehousing tools that offer unique methods of interacting with large, non-relational data sets. While Palantir uses private operating system models, Snowflake offers a more conventional, cloud-based warehousing approach. Let's compare the two tools. A next-generation data analysis company, Palantir provides three major software platforms: Foundry, Gotham and Apollo. Palantir specializes in AI-ready operating systems that can enhance decision-making through machine learning and DevOps.


Enabling Real-World AI Deployments at Scale

#artificialintelligence

The tools of AI/ML and big data have a common thread – they need data, and they need a lot of it. Conventional wisdom says the more, the better. Analysts predict global data creation will grow to more than 180 zettabytes by 2025 – and in 2020, the amount of data created and replicated hit a new high of 64.2 zettabytes. That data is extremely valuable – often irreplaceable and sometimes representing one-time or once-in-a-lifetime events. This data needs to be stored safely and securely; and while it's estimated that just a small percentage of this newly created data is retained, the demand for storage capacity continues to grow.
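As a rough illustration of the growth implied by those two figures, the short Python sketch below computes the compound annual growth rate needed to go from 64.2 zettabytes in 2020 to 180 zettabytes in 2025. The function name is hypothetical, and because the forecast is "more than 180 zettabytes", the result (roughly 23 percent per year) should be read as a lower bound rather than a precise estimate.

```python
def implied_annual_growth(start_zb: float, end_zb: float, years: int) -> float:
    """Return the compound annual growth rate that takes start_zb to end_zb."""
    return (end_zb / start_zb) ** (1 / years) - 1


# Figures quoted in the excerpt: 64.2 ZB in 2020, more than 180 ZB by 2025.
rate = implied_annual_growth(64.2, 180.0, 2025 - 2020)
print(f"Implied growth: {rate:.1%} per year")
```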


Ceph as a Secret Weapon for HPC

#artificialintelligence

Ceph is open-source software-defined storage (SDS) designed to provide highly scalable object-, block- and file-based storage under a unified system, setting it apart from other SDS solutions. It decouples data from the physical storage hardware through software abstraction layers, which provide scaling and fault management capabilities. As a distributed storage framework, Ceph has typically been used for high-bandwidth, medium-latency applications such as content delivery, archive storage, or block storage for virtualization. Its inherent scale-out support allows an organization to build large systems as demand increases. Additionally, it supports enterprise-grade features such as erasure coding, thin provisioning, cloning, load balancing, automated tiering between flash and hard drives, and simplified maintenance and debugging.
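To make the erasure coding feature mentioned above more concrete, the Python sketch below compares the usable share of raw capacity under plain replication with that of a generic k+m erasure-coded layout (k data chunks plus m coding chunks, tolerating the loss of any m chunks). The function names and the 4+2 profile are illustrative assumptions, not Ceph defaults or Ceph API calls.

```python
def usable_fraction_replication(copies: int) -> float:
    """Fraction of raw capacity that holds unique data with N full copies."""
    return 1 / copies


def usable_fraction_erasure(k: int, m: int) -> float:
    """Fraction of raw capacity that holds unique data with k data + m coding chunks."""
    return k / (k + m)


# Hypothetical comparison: 3 full copies versus a 4+2 erasure-coded layout.
# Both layouts survive the loss of two copies or chunks.
print(f"3x replication: {usable_fraction_replication(3):.0%} usable")
print(f"4+2 erasure coding: {usable_fraction_erasure(4, 2):.0%} usable")
```

The trade-off is that erasure coding stores less redundant data for the same failure tolerance, at the cost of extra computation when writing and rebuilding chunks.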


What Is Big Data?

#artificialintelligence

To really understand big data, it is helpful to have some historical background. Here is Gartner's definition, circa 2001 (which is still the go-to definition): Big data is data that contains greater variety, arriving in increasing volumes and with ever-higher velocity. This is known as the three Vs. Put simply, big data is larger, more complex data sets, especially from new data sources. These data sets are so voluminous that traditional data processing software simply can't manage them.


Optimizing AI and Deep Learning Performance

#artificialintelligence

As AI and deep learning use skyrockets, organizations are finding they are running these systems on resources similar to those they use for high-performance computing (HPC) systems, and wondering if this is the path to peak efficiency. Ostensibly, AI and HPC architectures have a lot in common, as AI has evolved into even more data-intensive machine learning (ML) and deep learning (DL) domains (Figure 1). First, workloads often require multiple GPU systems running as a cluster, with those systems shared in a coordinated way among multiple data scientists. Second, both AI and HPC workloads require shared access to data at a high level of performance and communicate over a fast RDMA-enabled network. Especially in scientific research, classic HPC systems nowadays tend to have GPUs added to the compute nodes so that the same cluster is suitable for both classic HPC and new AI/DL workloads.


It Is Not Just About The Data; It Is About Getting The Data To Where It Is Needed When It Is Needed

#artificialintelligence

There is an old saying that data rises to meet available storage. That is one reason I laughed at the phrase "big data" when it came out. We've always been working with as much data as possible. What has changed now is that data volumes in artificial intelligence (AI), HPC, and other modern arenas are finally hitting a wall that's not storage space or processing power; it's the bandwidth needed for the two to work together with low latency across the variety of data requests that exist. Newer solutions are coming out to address that problem.


Weka IO Cloud-Native Unified Storage Solutions for Entire Data Lifecycle - StorageNewsletter

#artificialintelligence

WekaIO, Inc. announced a transformative cloud-native storage solution, underpinned by the fast file system WekaFS, that unifies and simplifies the data pipeline for performance-intensive workloads and accelerated DataOps. The company has developed reference architectures (RAs) with object storage technology providers, like AWS, Cloudian, IBM, Seagate, Quantum, Scality, and others in the firm's Technology Alliance Program, to deliver cost-efficient, cloud-native storage solutions at any scale. And the firm's OEM partnership with Hitachi Vantara will deliver an integrated end-to-end stack solution based on the Hitachi Content Platform. WekaFS makes it possible to manage petabytes of data in a single, unified namespace wherever in the pipeline the data is stored, while also delivering the performance to accelerate AI/ML, genomics research, HPC, and HPDA workflows.

Manage more petabytes of data cost-effectively and with fewer resources

Extending the WekaFS namespace from high-performance flash to an Amazon Simple Storage Service (S3) REST-enabled cloud object storage system is a simpler and more cost-efficient strategy for managing petascale datasets without compromising performance.


microsoft/qlib

#artificialintelligence

Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. With Qlib, you can easily try your ideas to create better Quant investment strategies. For more details, please refer to our paper "Qlib: An AI-oriented Quantitative Investment Platform". At the module level, Qlib is a platform that consists of the above components. The components are designed as loose-coupled modules and each component could be used stand-alone.


India's Kaagaz Scanner Banks On AI Document Manager To Take CamScanner's Place - Inc42 Media

#artificialintelligence

The Delhi-based company launched the app two weeks before India's ban on Chinese apps and has acquired 400K users to date.

India's ban on 59 Chinese apps has opened up the Indian market for many native startups that have the potential to replace these banned apps. With CamScanner, one of the most prominent productivity apps in the market, also banned, Indians are looking for replacements. While Microsoft Office Lens, Adobe Scan and other apps are available, the current wave of adoption for Indian products has put the spotlight on Kaagaz Scanner. Launched about two weeks before the ban, Kaagaz Scanner is an app built by Sorted AI, a year-old file management platform. Kaagaz's launch was driven by the company's realisation that scanning is where the document storage process starts for millions of Indian users. Founder Snehanshu Gandhi said that Indian users scan documents in an app, and they also want to use the app to peruse the document storage solution.


Dell EMC Bets Big on PowerStore as the Future of Storage Infrastructure

#artificialintelligence

In 2020, IT modernization is key to sustaining growth and scaling operations. The massive rise in Big Data analytics has put intense pressure on current IT infrastructures. As data and hybrid clouds grow in size and scale, the demand for containerization and Kubernetes is also rising. These trends have led to a spurt of storage-related innovations not just in DevOps centers, but also in Edge, AutoML and enterprise datacenters, letting containers move into a more concrete position within on-premises and hybrid cloud architectures. In 2018, IBM announced the launch of its first NVMe solution, IBM Flash Storage.